Introduction Morphologic complete remission (CR) or composite CR (cCR), both of which require bone marrow evaluations (BMEs), remains the standard for response assessment in patients (pts) with myeloid neoplasms (MN). However, BMEs are conducted in only ~75% of pts in RCTs and ~55% in real-world evidence (RWE) studies [Pleyer 2025], highlighting the need for less invasive response criteria. Cox proportional hazards (CPH) feed-forward neural network (NN) analyses found IWG23-adapted peripheral blood complete remission (PB-CR) to be prognostically equivalent to CR/cCR in hypomethylating agent (HMA)-treated cohorts [Pleyer L, AJH 2023, 98(11):1685-98; Pleyer L, HemaSphere 2024, 8(S1):1358-9; Bewersdorf JP, HemaSphere 2024, 8(S1):1367-8]. This supports PB-CR as a clinically meaningful, non-invasive alternative when BMEs are not performed.

This study builds on prior work while avoiding guarantee (immortal) time bias (GTB), a critical issue in any time-to-event (TTE) analysis (including RCTs) in which classification events (e.g. CR, cCR, EFS, DFS) occur after the start of follow-up (FU) [Suissa S, Am J Epidemiol 2008, 167(4):492-9; Giobbie-Hurder A, JCO 2013, 31(23):2963-9]. GTB remains under-recognized by academia, industry, and regulators.

Methods Data were sourced from the Austrian Myeloid Registry (AMR, n=2,210; NCT04438889) and the phase-3 AML-001 RCT (n=488; NCT01074047, [Dombret H, Blood 2015, 126(3):291-9]).

We developed a novel survival regression model combining the semi-parametric CPH approach with a flexible deep-learning-based hazard estimator using a long short-term memory (LSTM) architecture. Unlike most prior models [Katzman JL, BMC Med Res Methodol 2018, 18(1):24; Zeng 2025], our model uses each patient's full longitudinal history, not just baseline or most recent data, for hazard estimation. Missing exogenous variables (but not outcomes) were imputed.
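As an illustration of the CPH component only, the following minimal Python sketch evaluates a Cox negative log partial likelihood for a given set of log-risk scores. It assumes that the network (here, the LSTM) outputs one log-risk score per patient and that the loss takes the standard Breslow form; the abstract does not specify the exact loss used, and the function and argument names are hypothetical.

```python
import math

def cox_neg_log_partial_likelihood(times, events, risk_scores):
    """Negative log partial likelihood (Breslow form).

    times: follow-up time per patient
    events: 1 if death was observed, 0 if censored
    risk_scores: log-risk per patient (assumed here to come from the network)
    """
    n = len(times)
    total = 0.0
    for i in range(n):
        if not events[i]:
            continue  # censored patients contribute only through risk sets
        # Risk set: all patients still under observation at the event time of i
        at_risk = [risk_scores[j] for j in range(n) if times[j] >= times[i]]
        # Log-sum-exp with the maximum subtracted for numerical stability
        m = max(at_risk)
        log_sum = m + math.log(sum(math.exp(s - m) for s in at_risk))
        total -= risk_scores[i] - log_sum
    return total
```

Assigning higher scores to patients who die earlier lowers this loss, which is the property a hazard estimator trained on it would exploit.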

To assess the importance of BME-related variables for overall survival prediction, we performed a permutation ablation analysis. Goodness-of-fit was quantified using the time-dependent concordance index (C-index) [Antolini L, Stat Med 2005, 24(24):3927-77] and AUROC at 20 landmark-horizon pairs (landmarks at 0, 3, 6, 9, and 12 months [mo]; horizons at +3, +6, +9, and +12 mo from each landmark). Feature importance was evaluated by randomly permuting the BME-related variables 100 times and comparing the resulting performance metrics with the unpermuted base case. Empirical p-values were computed and adjusted using the Benjamini–Yekutieli correction.
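The evaluation steps described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the study code: the feature names in the usage note are hypothetical, permutation is assumed to shuffle values across rows, and the empirical p-value uses the common one-sided add-one convention, which the abstract does not specify. The Benjamini–Yekutieli adjustment follows its standard step-up definition.

```python
import random

def landmark_horizon_pairs(landmarks=(0, 3, 6, 9, 12), offsets=(3, 6, 9, 12)):
    """All (landmark, horizon) pairs in months: 5 landmarks x 4 offsets = 20."""
    return [(lm, lm + off) for lm in landmarks for off in offsets]

def permute_feature(rows, name, rng):
    """Return copies of rows with one feature's values shuffled across rows."""
    values = [row[name] for row in rows]
    rng.shuffle(values)
    return [{**row, name: v} for row, v in zip(rows, values)]

def empirical_p_value(baseline, permuted_scores):
    """One-sided p-value: share of permuted runs scoring at least as well as
    the unpermuted baseline (add-one convention, assumed here)."""
    at_least_as_good = sum(1 for s in permuted_scores if s >= baseline)
    return (1 + at_least_as_good) / (1 + len(permuted_scores))

def benjamini_yekutieli(p_values):
    """Benjamini-Yekutieli adjusted p-values, returned in the input order."""
    m = len(p_values)
    c_m = sum(1.0 / k for k in range(1, m + 1))  # harmonic correction factor
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0  # also caps adjusted values at 1
    for rank in range(m, 0, -1):  # step from the largest p-value downwards
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted
```

In use, one would call `permute_feature(rows, "bm_blasts", random.Random(seed))` for each of the 100 permutations (with `"bm_blasts"` standing in for any BME-related variable), re-score the model at every pair from `landmark_horizon_pairs()`, and pass the resulting empirical p-values to `benjamini_yekutieli`.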

Results Data from 2,698 pts with MDS (568), MDS/MPN (47), CMML (388) and AML (1,695) at 1st line treatment start were included. Median FU was 8.8 mo. The dataset comprised 4,164 treatment lines (26% intensive chemo, 7% venetoclax-based, 46% HMA-based, 15% others, 6% BSC) with 36,291 therapy cycles. The NN was trained on 56 covariates across 40,767 timepoints, including 36,046 differential blood counts, 25,766 chemistry labs, and 5,626 BMEs.

Input variables included patient characteristics, treatment, comedications, blood count parameters, lab values, and BME-specific variables (BM blasts, IPSS, RIPSS, complex karyotype, ELN22 cytogenetic risk, number of cytogenetic/molecular aberrations, ELN22-cCR, IWG23-cCR), which were permuted for feature importance testing.

Ablating BME-related variables did not significantly reduce model performance in the full cohort (C-index before ablation: 0.795; median after ablation 0.793; p=0.089), or in AML-only analyses (C-index: 0.802 → 0.799; p=0.158). None of the 20 landmark-horizon comparisons showed performance loss post-ablation, indicating that longitudinal peripheral blood data alone retained predictive accuracy.

Conclusions Excluding BME-related data did not substantially impair survival model performance, and thus survival prediction (C-index, AUROC), suggesting that BME-related variables may be dispensable when rich longitudinal peripheral blood data are available. This study demonstrates the possibilities offered by modern machine learning techniques for predicting patient risk on the basis of prospectively collected, high-quality registry and RCT data. Ongoing efforts include extended FU, incorporation of MDS-001 RCT data (n=358; NCT00071799 [Fenaux P, Lancet Oncol 2009, 10(3):223-32]), improved interpretability using backward parameter selection and SeqShap [Yiang G, DASFAA 2024, 14853:89-104], and the conversion of IWG23-PB-CR to a non-binary response indicator. Future models will incorporate quality-of-life, adverse events, and raw NGS and flow cytometry data.
